We study the random K-satisfiability problem using a partition function where each solution is
reweighted according to the number of variables that satisfy every clause. We apply belief propagation
and the related cavity method to the reweighted partition function. This allows us to obtain several
new results on the properties of the random K-satisfiability problem. In particular, the reweighting allows
us to introduce a planted ensemble that generates instances that are, in some region of parameters,
equivalent to random instances. We are hence able to generate at the same time a typical SAT
instance and one of its solutions. We study the relation between clustering and belief propagation
fixed points, and we give direct evidence for the existence of purely entropic (rather than energetic)
barriers between clusters in some region of parameters in the random K-satisfiability problem. We
show explicitly how to find solutions of random K-SAT leading to a non-trivial whitening core;
such solutions were known to exist but had so far never been found on large instances. Finally, we
discuss the algorithmic hardness of such planted instances and determine a region of parameters in
which planting leads to satisfiable benchmarks that, to our knowledge, are the hardest known.

I. MOTIVATION

The satisfiability of Boolean formulas is a fundamental problem in theoretical computer science. It was the first 
problem shown to be NP-complete [1, 2], and it is of central relevance in various practical applications, including
artificial intelligence, planning, hardware and electronic design, automation, verification and more. It can thus be 
thought of as the "Ising model" of computer science. Ensembles of randomly generated instances of the satisfiability 
problem have emerged in computer science as a way of evaluating algorithmic performance and addressing questions 
regarding the average case complexity. 

An instance of the random K-SAT problem consists of N Boolean variables and M clauses. Each clause contains
a subset of K distinct variables chosen uniformly at random, and each clause forbids one random assignment of these
K variables out of the $2^K$ possible ones. The problem is satisfiable if there exists an assignment of variables that
simultaneously satisfies all clauses. In this case we call such an assignment a solution to the problem. When the
density of constraints $\alpha = M/N$ is increased, the formulas become less likely to be satisfiable. In the thermodynamic
limit where $N \to \infty$ at fixed $\alpha$, there is a sharp transition from a phase in which the formulas are almost surely
satisfiable to a phase where they are almost surely unsatisfiable [3]. It is also a well-known empirical result that the
hardest instances are found near this threshold [4-6].
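
As a minimal sketch of this ensemble (not taken from the authors' code), the following Python snippet draws such an instance and tests whether an assignment is a solution. The function names and the representation of a clause as a pair (variables, forbidden assignment) are illustrative assumptions.

```python
import random

def random_ksat(n_vars, n_clauses, k, rng):
    """Draw a random K-SAT instance (illustrative representation).

    Each clause is a pair (variables, forbidden): K distinct variable
    indices and the single assignment of them that the clause forbids.
    """
    clauses = []
    for _ in range(n_clauses):
        variables = tuple(rng.sample(range(n_vars), k))           # K distinct variables
        forbidden = tuple(rng.randint(0, 1) for _ in range(k))    # one of the 2^K assignments
        clauses.append((variables, forbidden))
    return clauses

def is_solution(clauses, assignment):
    """True if no clause sees its forbidden assignment."""
    return all(
        tuple(assignment[v] for v in variables) != forbidden
        for variables, forbidden in clauses
    )

rng = random.Random(0)
instance = random_ksat(n_vars=20, n_clauses=40, k=3, rng=rng)  # density alpha = M/N = 2
```

Passing an explicit `random.Random` object keeps the sketch reproducible; any other source of uniform randomness would do.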

Random K-SAT has attracted the interest of mathematicians, computer scientists and statistical physicists. One
very fruitful methodological direction for studying random K-SAT is the belief propagation (BP) algorithm [7, 8], which
is closely related to the cavity method that was developed in statistical physics for studies of mean-field spin glasses
[9]. The results and insights coming from the cavity method are remarkable. The satisfiability threshold and other
phase transitions in the structure of solutions have been described in [10-13]. In particular, it was shown that for
K > 3 the space of solutions for highly constrained but still satisfiable instances splits into exponentially many clusters
which are far away from each other, and in some cases this clustering has been rigorously confirmed [13-15]. Another
important concept studied recently in random K-SAT [16-18] is that of frozen variables: a given variable is frozen
relative to a given cluster of solutions if the variable is fixed to the same value in all the solutions of this cluster.
Clearly any local search procedure which approaches a cluster must correctly identify the frozen variables.

In spite of this substantial body of results obtained in recent years, there are still many open questions in random K-SAT
concerning the relation between the detailed properties of the energy landscape predicted by the theory on the one hand,
and what is found by algorithms on the other. In the limit of large K, the situation is relatively well under
control [13, 19]: phase transitions in the space of solutions can be identified with a threshold beyond which many
simple algorithms (or at least the methods used to analyze them) fail. However, for small K the algorithmic limits
are strongly algorithm-dependent, and many algorithmically relevant questions about the energy landscape are still
open. In particular, relatively simple stochastic local search algorithms have been found to be efficient in regimes
of the constraint density where clustering occurs [20]. Another puzzling observation is that so far no one has
been able to find solutions of random K-SAT with frozen variables, in spite of the fact that theory clearly predicts
their existence [11, 21] at large enough density of constraints. Another aspect of this paradox is that it has so far
been impossible to find non-trivial fixed points of BP, while the theory shows that there are exponentially many of
them (associated with clusters of solutions); indeed, an algorithm like survey propagation precisely counts the
BP fixed points associated with frozen clusters [11].

In this work we address some of these issues by developing a general 'quiet planting' procedure for the random
K-SAT problem. Planting is a way to generate a K-SAT instance together with a solution. Instead of generating the clauses
randomly and then trying to find a solution, one first generates a special random configuration of variables, which we
call the 'planted configuration', and then one generates the clauses in such a way that each clause is satisfied by the
planted configuration. By construction the planted configuration is a solution of the instance so generated; one says
that it has been 'planted' in the instance. Unfortunately, the naive planting that we have just described generates
instances which are very distinct from typical random K-SAT instances. A more careful procedure, called
'quiet planting', generates K-SAT instances with the same distribution as fully random
instances (when the density of constraints is below a certain threshold). Quiet planting is therefore a procedure that
provides a generic instance together with a solution. So far quiet planting had been developed for other constraint
satisfaction problems [19, 22-24]. In this paper we give a general study of quiet planting in K-SAT. In order to study
its properties, we introduce and study the reweighted belief propagation and the associated cavity method results.
This study allows us to answer several important open questions. More specifically, we address the following points:

• We find non-trivial BP fixed points by initializing BP in a quietly-planted configuration. This shows that such
non-trivial fixed points exist, as expected from the cavity method analysis. This answers a long-standing question
asked by random K-SAT practitioners who could not find these non-trivial BP fixed points. They exist, but
they are hard to find, and one needs a special procedure like planting in order to see them.

• Using the reweighted BP we explicitly show that purely entropic barriers do exist in random K-SAT. A commonly
used definition of clusters is the following: given a K-SAT formula, construct a graph where each solution is
a vertex, and two solutions are connected by an edge if they differ in exactly one variable. Then clusters are
connected components of this graph. It is known that, with this definition, well separated clusters exist (in the
appropriate range of $\alpha$) when K is large enough. However, this definition (and simple generalizations of it) is
too restrictive in general. It completely neglects the possibility that two different clusters can in fact be sets of
solutions connected by a very narrow path, leading to entropic barriers instead of energetic barriers. Entropic
barriers have been known to exist in simple spin glass models (see for instance [25, 26]), but this is the first time
that they are found in K-SAT.

• An interesting problem which has been studied on several occasions in the literature [27, 28] is how to plant
a solution in a 3-SAT problem such that the satisfiability problem is nevertheless very hard. In this paper we
clarify in which region the planted instances are hard. Qualitatively we reach the same conclusions as [27]. We
give the correct boundary of the hard region (whereas the calculations in [27] were only approximate) and we
provide more detailed theoretical arguments for this result. In the end we have a way to create the hardest
known (at least to the authors) satisfiable 3-SAT formulas. Note that the hard ensemble discussed here is closely
related to the one of [29].
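
The single-flip definition of clusters recalled in the second point above can be made concrete on a toy formula. The sketch below is only an illustration for very small N (it enumerates all $2^N$ assignments); the function names and the clause representation (variables, forbidden assignment) are assumptions made for this example.

```python
from itertools import product

def solutions(n_vars, clauses):
    """All assignments that avoid the forbidden pattern of every clause."""
    return [
        s for s in product((0, 1), repeat=n_vars)
        if all(tuple(s[v] for v in vs) != tuple(fb) for vs, fb in clauses)
    ]

def single_flip_clusters(n_vars, clauses):
    """Connected components of the graph whose vertices are solutions and whose
    edges join solutions differing in exactly one variable."""
    unseen = set(solutions(n_vars, clauses))
    clusters = []
    while unseen:
        start = unseen.pop()
        component, stack = [start], [start]
        while stack:
            s = stack.pop()
            for i in range(n_vars):
                t = s[:i] + (1 - s[i],) + s[i + 1:]   # flip variable i
                if t in unseen:
                    unseen.remove(t)
                    component.append(t)
                    stack.append(t)
        clusters.append(component)
    return clusters

# Two clauses on variables (0, 1) forbidding (0, 1) and (1, 0): the two
# remaining solutions (0, 0) and (1, 1) differ in two variables.
toy = [((0, 1), (0, 1)), ((0, 1), (1, 0))]
toy_clusters = single_flip_clusters(2, toy)
```

This construction only detects energetic separation: two groups of solutions joined by a narrow path end up in the same component, which is exactly the limitation discussed in the text.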

II. REWEIGHTING IN RANDOM K-SAT

The satisfiability problem is defined over N Boolean variables $s_i \in \{0, 1\}$ and M clauses. Each clause contains K
variables. If a variable i is negated in the clause a we set $J_{ia} = 1$, otherwise we set $J_{ia} = 0$. A clause a is satisfied if
and only if

$$n_a = \sum_{i \in \partial a} \left(1 - \delta_{s_i, J_{ia}}\right) > 0, \tag{1}$$

where $\partial a$ is the set of variables belonging to the clause a. Here we introduced $n_a$ as the number of variables that
satisfy the clause a.

Let us first define the standard measure. We assume that there exists at least one solution, and we define the
partition function as the number of solutions:

$$Z = \sum_{\{s_i\}} \prod_a C_a(\{s_i\}_{i \in \partial a}), \tag{2}$$

where $C_a = 1$ if clause a is satisfied and $C_a = 0$ otherwise. The sum is over all the $2^N$ assignments of variables. One
introduces a 'Boltzmann'-type measure as the uniform measure over all solutions:

$$\mu(\{s_i\}) = \frac{1}{Z} \prod_a C_a(\{s_i\}_{i \in \partial a}). \tag{3}$$

The cavity method (or BP) can be used to study the properties of this 'standard' measure.

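Let us now define a reweighted measure. We introduce K parameters $\Lambda_1, \dots, \Lambda_K$ which are non-negative real numbers,
and we define the '$\Lambda$-measure' as:

$$\mu_\Lambda(\{s_i\}) = \frac{1}{Z(\Lambda)} \prod_a \left[ C_a(\{s_i\}_{i \in \partial a}) \, \Lambda_{n_a(\{s_i\}_{i \in \partial a})} \right], \tag{4}$$

where the reweighted partition function is defined as

$$Z(\Lambda) = \sum_{\{s_i\}} \prod_a \left[ C_a(\{s_i\}_{i \in \partial a}) \, \Lambda_{n_a(\{s_i\}_{i \in \partial a})} \right]. \tag{5}$$

We denote the vector $\Lambda = (\Lambda_1, \Lambda_2, \dots, \Lambda_K)$. In order to have a unique measure associated with each vector $\Lambda$,
we choose to fix the normalization

$$1 = \sum_{r=1}^{K} \binom{K}{r} \Lambda_r. \tag{6}$$

The standard measure and partition function (the latter up to the constant factor $(2^K - 1)^{-M}$) are recovered when all
components of $\Lambda$ are equal: $\forall r \;\; \Lambda_r = 1/(2^K - 1)$.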
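
To make the definitions concrete, here is a brute-force sketch of the number of satisfying variables $n_a$, the normalization of $\Lambda$, and the reweighted partition function on a tiny formula. It is only meant for very small N; the function names and the convention `lam[0] = 0` (which plays the role of $C_a$) are assumptions of this example.

```python
from itertools import product
from math import comb, isclose

def n_satisfying(clause, s):
    """Number of variables satisfying the clause, as in Eq. (1).
    clause: list of (variable, J) pairs; a literal is satisfied iff s[variable] != J."""
    return sum(1 for v, J in clause if s[v] != J)

def is_normalized(lam):
    """Check the normalization (6). lam[r] = Lambda_r for r = 1..K, lam[0] = 0."""
    K = len(lam) - 1
    return isclose(sum(comb(K, r) * lam[r] for r in range(1, K + 1)), 1.0)

def reweighted_Z(n_vars, clauses, lam):
    """Brute-force Z(Lambda) of Eq. (5); lam[0] = 0 removes violated clauses."""
    Z = 0.0
    for s in product((0, 1), repeat=n_vars):
        w = 1.0
        for clause in clauses:
            w *= lam[n_satisfying(clause, s)]
            if w == 0.0:
                break
        Z += w
    return Z

# One 2-SAT clause (x0 OR x1) with uniform weights Lambda_r = 1/(2^K - 1) = 1/3:
# the formula has 3 solutions, so Z(Lambda) = 3 * (1/3).
uniform = [0.0, 1.0 / 3.0, 1.0 / 3.0]
Z_uniform = reweighted_Z(2, [[(0, 0), (1, 0)]], uniform)
```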



The main subject of this paper is to generalize the usual phase diagram of random K-SAT obtained with the
cavity method as a function of the parameter $\Lambda$, and to discuss new interesting results which can be derived using this
generalized measure. Some properties do not depend on the reweighting parameters $\Lambda$ (as long as
$\Lambda_r > 0$ for all $r = 1, \dots, K$), for instance the satisfiability threshold or the freezing transition at which strictly all
clusters start to contain frozen variables. But other physically important phase transitions defined in [13, 17], such as
the clustering or the condensation transitions, do depend on the reweighting parameters.

Note that special cases of the reweighted partition function were studied previously. In particular, [30] used the set
of reweighting parameters

$$\Lambda_r = \frac{\gamma^r}{(1+\gamma)^K - 1} \tag{7}$$

as a tool for using the second moment method to obtain a lower bound on the satisfiability threshold. Indeed,
the success of this reweighting in the second moment calculations inspired our study. In fact, in the region where
a second moment lower bound can be proven, the quenched and annealed entropy densities are equal, and this is a
condition for quiet planting to work. The same reweighting was later used in several other rigorous works in
conjunction with the second moment method. Let us also mention the work of [28], where planting according to the
reweighting (7) was studied numerically and using first and second moment calculations. However, it seems that
the authors of [28] did not notice that for a certain value of the reweighting parameter the generated instances are
equivalent to random instances, which is one of the crucial points in our approach. Reweighted planting was also
studied in [27] as a way to create hard instances. We shall comment on all these results in the light of our findings.

III. BELIEF PROPAGATION FOR THE REWEIGHTED PARTITION FUNCTION 

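The reweighted partition function (5) can be computed via BP, which is exact on trees and corresponds to the
replica symmetric approximation on sparse random graphs. Considering the Boltzmann probability measure (4), we
define $\psi^{a \to i}_{s_i}$ as the probability that the constraint a is satisfied, conditioned on the value of variable i being
$s_i$. Similarly, $\chi^{j \to a}_{s_j}$ is the probability that the variable j takes value $s_j$, conditioned on the constraint a
having been removed from the graph. These messages then satisfy the set of equations

$$\chi^{i \to a}_{s_i} = \frac{\prod_{b \in \partial i \setminus a} \psi^{b \to i}_{s_i}}{\prod_{b \in \partial i \setminus a} \psi^{b \to i}_{1} + \prod_{b \in \partial i \setminus a} \psi^{b \to i}_{0}}, \tag{8}$$

$$\psi^{a \to i}_{J_{ia}} = \frac{1}{Z^{a \to i}} \sum_{\{s_j\}_{j \in \partial a \setminus i}} \theta\Big(K - 1 - \sum_{j \in \partial a \setminus i} \delta_{s_j, J_{ja}}\Big) \, \Lambda_{K - 1 - \sum_{j \in \partial a \setminus i} \delta_{s_j, J_{ja}}} \prod_{j \in \partial a \setminus i} \chi^{j \to a}_{s_j}, \tag{9}$$

$$\psi^{a \to i}_{1 - J_{ia}} = \frac{1}{Z^{a \to i}} \sum_{\{s_j\}_{j \in \partial a \setminus i}} \Lambda_{K - \sum_{j \in \partial a \setminus i} \delta_{s_j, J_{ja}}} \prod_{j \in \partial a \setminus i} \chi^{j \to a}_{s_j}, \tag{10}$$

where the normalization $Z^{a \to i}$ ensures that $\psi^{a \to i}_0 + \psi^{a \to i}_1 = 1$, and $\theta(x)$ is the step function (equal to 1 for
$x > 0$ and to 0 otherwise). Once these messages have been found, the marginal probability that variable i takes value
$s_i$, denoted by $\chi^{i}_{s_i}$, can be computed as:

$$\chi^{i}_{s_i} = \frac{\prod_{a \in \partial i} \psi^{a \to i}_{s_i}}{\prod_{a \in \partial i} \psi^{a \to i}_{1} + \prod_{a \in \partial i} \psi^{a \to i}_{0}}. \tag{11}$$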



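The log-partition function can then be computed using the Bethe formula

$$s(\Lambda) = \log Z(\Lambda) = \sum_a \log Z^{a} + \sum_i \log Z^{i} - \sum_{(ia)} \log Z^{ia}, \tag{12}$$

where

$$Z^{i} = \prod_{a \in \partial i} \psi^{a \to i}_{1} + \prod_{a \in \partial i} \psi^{a \to i}_{0}, \tag{13}$$

$$Z^{a} = \sum_{\{s_i\}_{i \in \partial a}} \theta\Big(K - \sum_{i \in \partial a} \delta_{s_i, J_{ia}}\Big) \, \Lambda_{K - \sum_{i \in \partial a} \delta_{s_i, J_{ia}}} \prod_{i \in \partial a} \chi^{i \to a}_{s_i}, \tag{14}$$

$$Z^{ia} = \chi^{i \to a}_{0} \psi^{a \to i}_{0} + \chi^{i \to a}_{1} \psi^{a \to i}_{1}. \tag{15}$$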
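
As an illustration of how the clause-to-variable update can be evaluated, the sketch below sums over the $2^{K-1}$ assignments of the other variables of a clause. It assumes the reading of Eqs. (9)-(10) in which the weight is $\Lambda_n$ with n the number of satisfying variables, and it encodes the step function through the convention `lam[0] = 0`. The function name and argument layout are hypothetical.

```python
from itertools import product

def clause_to_variable(chis, J_others, J_i, lam):
    """One clause-to-variable message psi^{a->i} = (psi_0, psi_1).

    chis[j]     = (chi_0, chi_1), incoming message of the j-th other variable
    J_others[j] = negation of that variable in the clause
    J_i         = negation of the target variable i
    lam[r]      = Lambda_r for r = 1..K, with lam[0] = 0 (violated clause)
    """
    w_unsat = 0.0   # s_i = J_i: variable i does not satisfy the clause
    w_sat = 0.0     # s_i = 1 - J_i: variable i satisfies the clause
    for s in product((0, 1), repeat=len(chis)):
        p = 1.0
        for chi, s_j in zip(chis, s):
            p *= chi[s_j]
        n_others = sum(1 for s_j, J_j in zip(s, J_others) if s_j != J_j)
        w_unsat += lam[n_others] * p        # lam[0] = 0 plays the role of theta
        w_sat += lam[n_others + 1] * p
    z = w_unsat + w_sat
    psi = [0.0, 0.0]
    psi[J_i] = w_unsat / z
    psi[1 - J_i] = w_sat / z
    return tuple(psi)

# K = 3 with unbiased incoming messages: the uniform weights give a biased
# output, while lam_special (normalized, Lambda_1 + 2 Lambda_2 + Lambda_3 = 1/2)
# returns an unbiased one.
half = [(0.5, 0.5), (0.5, 0.5)]
psi_uniform = clause_to_variable(half, (0, 0), 0, [0.0, 1 / 7, 1 / 7, 1 / 7])
psi_special = clause_to_variable(half, (0, 0), 0, [0.0, 0.2, 0.1, 0.1])
```

With uniform weights the unbiased input is not reproduced, so the symmetric point is not a fixed point of standard BP; the second weight vector is an example where it is.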



IV. QUIET PLANTING IN RANDOM K-SAT

There is a subset $\Lambda^*$ of the reweighting parameters (depending on the value of K) for which the reweighted random
K-SAT problem has quite astonishing properties. The first such interesting property is that, for the special reweighting
$\Lambda = \Lambda^*$, the fixed point of the belief propagation equations is factorized, i.e. messages at the fixed point do not depend on
the edge indices. We shall use this property as the definition of $\Lambda^*$ and explore further properties of this special
reweighting.

A. Special reweighting and quenched-annealed equivalence

Since in random K-SAT the negations are chosen at random, there is a global symmetry between 0's and 1's. A
factorized fixed point hence also needs to be symmetric, i.e. $\psi^{a \to i}_{s} = \chi^{i \to a}_{s} = 1/2$. With the use of the normalization (6)
it is easy to see that Eqs. (9)-(10) permit such a fixed point if and only if $\Lambda$ satisfies

$$\sum_{r=1}^{K} \binom{K-1}{r-1} \Lambda_r = \frac{1}{2}. \tag{16}$$

Note that, if we restrict to the power-law choice of the reweighting parameters (7) made in [28, 30], we recover from
(16) the condition for the special value $\gamma = \gamma^*$ used in these works, defined by:

$$1 = (1 + \gamma^*)^{K-1} (1 - \gamma^*). \tag{17}$$
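
The special value $\gamma^*$ is easy to obtain numerically. The sketch below solves Eq. (17) by bisection and then checks the two linear conditions on the power-law weights (7). It assumes that the factorization condition takes the form $\sum_r \binom{K-1}{r-1} \Lambda_r = 1/2$, which is what the symmetric point of Eqs. (9)-(10) requires once the normalization (6) is imposed. All function names are illustrative.

```python
from math import comb

def gamma_star(K, n_iter=100):
    """Non-trivial root of (1 + g)^(K-1) (1 - g) = 1 for K >= 3, by bisection.
    The function below is positive at g = 0.5 and equals -1 at g = 1."""
    f = lambda g: (1.0 + g) ** (K - 1) * (1.0 - g) - 1.0
    lo, hi = 0.5, 1.0
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def power_law_lambda(K, g):
    """Weights of Eq. (7): Lambda_r = g^r / ((1 + g)^K - 1); index 0 holds 0."""
    D = (1.0 + g) ** K - 1.0
    return [0.0] + [g ** r / D for r in range(1, K + 1)]

def normalization_lhs(lam):
    """Left-hand side of the normalization (6); should equal 1."""
    K = len(lam) - 1
    return sum(comb(K, r) * lam[r] for r in range(1, K + 1))

def factorization_lhs(lam):
    """Left-hand side of the assumed factorization condition; should equal 1/2."""
    K = len(lam) - 1
    return sum(comb(K - 1, r - 1) * lam[r] for r in range(1, K + 1))

g3 = gamma_star(3)               # for K = 3 the root is the inverse golden ratio
lam3 = power_law_lambda(3, g3)   # lam3[2] is the value of Lambda_2 quoted for 3-SAT
```

For K = 3 the equation reduces to $\gamma^2 + \gamma - 1 = 0$, so $\gamma^* = (\sqrt{5} - 1)/2$, which gives a closed-form check of the bisection.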

The Bethe entropy density (12) in the case of a factorized BP fixed point is equal to

$$s_{\rm BP}(\Lambda^*) = (1 - \alpha K) \log 2. \tag{18}$$

Note that in general, if the replica symmetric assumption is correct in the thermodynamic limit, then the Bethe
entropy is asymptotically equal to the quenched entropy $s_{\rm quenched} = \lim_{N \to \infty} \mathbb{E}[\log Z(\Lambda^*)]/N$. Consequently, as long
as the replica symmetric assumption at $\Lambda^*$ is correct, the annealed entropy density is equal to the quenched entropy
density. It is usually the case that the replica symmetric assumption is correct at low density of constraints, up to a
point of a thermodynamic phase transition that corresponds to a non-analyticity of the quenched entropy.

Let us now compute the annealed entropy density of the reweighted ensemble with arbitrary $\Lambda$, defined as
$s_{\rm annealed}(\Lambda) = \lim_{N \to \infty} \log \mathbb{E}[Z(\Lambda)]/N$. We get:

$$\mathbb{E}[Z(\Lambda)] = 2^N \left[ \sum_{r=1}^{K} \binom{K}{r} \frac{\Lambda_r}{2^K} \right]^{M} = 2^N \, 2^{-\alpha K N}, \tag{19}$$

where in the second equality we used the normalization condition (6). We see that the annealed entropy at any $\Lambda$
equals the Bethe entropy of the factorized BP solution found for $\Lambda = \Lambda^*$, $s_{\rm annealed} = s_{\rm BP}(\Lambda^*)$. This is actually a
general result, see [31]. Consequently, as long as the replica symmetric assumption at $\Lambda^*$ is correct, the quenched
entropy density at $\Lambda^*$ is equal to the annealed entropy density. This last property is the crucial point that makes
quiet planting in the sense of [13, 23, 24] possible.
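
The single-clause average behind this computation can be checked by direct enumeration. In the sketch below (function names are illustrative), the weight $\Lambda_{n_a}$ of a fixed assignment is averaged over the $2^K$ equiprobable negation patterns of one clause; for any normalized $\Lambda$ the result is $2^{-K}$, which is the factor raised to the power M in the annealed average.

```python
from itertools import product
from math import log

def mean_clause_weight(lam):
    """Average of Lambda_{n_a} over all 2^K negation patterns of one clause.

    lam[r] = Lambda_r for r = 1..K and lam[0] = 0 (violated clause). By the
    0/1 symmetry the result is the same for every fixed assignment, so the
    all-zeros assignment is used.
    """
    K = len(lam) - 1
    s = (0,) * K
    total = 0.0
    for J in product((0, 1), repeat=K):
        n_a = sum(1 for s_i, J_i in zip(s, J) if s_i != J_i)   # satisfying variables
        total += lam[n_a]
    return total / 2 ** K

def annealed_entropy_density(alpha, K):
    """log E[Z(Lambda)] / N = (1 - alpha * K) log 2 for normalized Lambda."""
    return (1.0 - alpha * K) * log(2.0)

w_uniform = mean_clause_weight([0.0, 1 / 7, 1 / 7, 1 / 7])
w_other = mean_clause_weight([0.0, 0.2, 0.1, 0.1])
```

Both weight vectors satisfy the normalization, so both averages equal $2^{-3}$ up to rounding, independently of how the weight is distributed among the $\Lambda_r$.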

B. Quiet planting in the reweighted ensemble

The definition of the $\Lambda$-reweighted planted ensemble is the following:

1. Choose the geometry of the random K-SAT formula as before (M clauses, each containing a random K-tuple
of variables).

2. Define a special configuration, called the planted configuration, by randomly assigning 1/2 of the variables to
the value 0 and the other 1/2 to 1.

3. Define the probability p(r) by:

$$p(r) = \binom{K}{r} \Lambda_r \quad \text{for } r > 0, \qquad p(0) = 0. \tag{20}$$

For each clause a, choose $r_a > 0$ from this distribution, and then choose at random one out of the $\binom{K}{r_a}$
configurations of negations for which the planted configuration satisfies the clause by $r_a$ different variables.

In this planted ensemble, the probability that a randomly chosen clause is satisfied by r variables from the planted
configuration is thus equal to p(r) defined in (20).
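
The three steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function name, the clause representation as a list of (variable, J) pairs, and the convention `lam[0] = 0` are assumptions of this example.

```python
import random
from math import comb

def plant_instance(n_vars, n_clauses, lam, rng):
    """Generate (planted configuration, clauses) from the reweighted planted ensemble.

    lam[r] = Lambda_r for r = 1..K (lam[0] = 0). Each clause is a list of
    (variable, J) pairs; a literal is satisfied iff s[variable] != J.
    """
    K = len(lam) - 1
    # Step 2: half of the variables planted to 0, the other half to 1.
    planted = [0] * (n_vars // 2) + [1] * (n_vars - n_vars // 2)
    rng.shuffle(planted)
    # Step 3: p(r) = C(K, r) * Lambda_r for r >= 1.
    r_values = list(range(1, K + 1))
    p = [comb(K, r) * lam[r] for r in r_values]
    clauses = []
    for _ in range(n_clauses):
        variables = rng.sample(range(n_vars), K)          # Step 1: random K-tuple
        r_a = rng.choices(r_values, weights=p)[0]         # number of satisfying variables
        satisfying = set(rng.sample(range(K), r_a))       # one of the C(K, r_a) choices
        clause = []
        for position, v in enumerate(variables):
            # Negate so that exactly the chosen positions satisfy the clause.
            J = 1 - planted[v] if position in satisfying else planted[v]
            clause.append((v, J))
        clauses.append(clause)
    return planted, clauses

planted, clauses = plant_instance(50, 100, [0.0, 0.2, 0.1, 0.1], random.Random(0))
```

Since $r_a \geq 1$ for every clause, the planted configuration satisfies the formula by construction.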

First of all, notice that since every planted configuration is consistent with the same number of instances, the planted
configuration has all the properties of a configuration sampled at random from the corresponding Boltzmann
measure $\mu(\Lambda)$, Eq. (4). This is a very generic property of planting. Furthermore, it is easy to see that this planting
procedure generates instances randomly but in general not uniformly over all instances. Every planted instance
appears with probability proportional to its partition function $Z(\Lambda)$. Hence the planted ensemble of instances is in
general very different from the random ensemble. Only when the annealed average of the entropy density
equals the quenched average and N is large enough are the fluctuations of $Z(\Lambda)$ small enough that the planting
procedure generates instances equivalent to the random ones (i.e. every property that holds with high probability in
the random ensemble also holds with high probability in the planted ensemble), as discussed in [19, 23, 32, 33]. This
is what we call "quiet planting", and it is what gives a special role to the parameter $\Lambda^*$ defined by (16).

To summarize, as long as the replica symmetric assumption is correct at $\Lambda^*$, that is before the corresponding
condensation (static 1RSB) phase transition [13], planting at $\Lambda^*$ is quiet, which means that it has the following two
properties. First, it creates formulas of the satisfiability problem that in the thermodynamic limit $N \to \infty$ have the
same thermodynamic properties as random formulas. Second, the planted configuration is an equilibrium configuration
with respect to $\mu(\Lambda^*)$, not with respect to the usually considered uniform measure over all solutions. In the next
sections we explore consequences of these simple but theoretically intriguing properties.

C. Cavity equations at $\Lambda^*$

Running BP on a given instance generated with quiet planting is perfectly possible and not very computationally
demanding. However, more accurate results can be obtained if the same procedure is studied on average over planted
graphs. The corresponding distributional equations are nothing but the replica-symmetric cavity equations; they can
be solved efficiently with the population dynamics technique [34]. In the present case the object studied by the cavity
method is the probability $P_{s_i}(\psi)$ that a message arriving on a site i takes the value $\psi = (\psi_0, \psi_1)$, conditioned on the
planted configuration on i being equal to $s_i$. It satisfies:

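$$P_{s_i}(\psi) = \sum_{J_i} \sum_{\{s_j\}} P(\{s_j\}, s_i) \int \prod_{j=1}^{K-1} \left[ \sum_{l_j=0}^{\infty} q(l_j) \prod_{k_j=1}^{l_j} P_{s_j}(\psi^{k_j}) \, \mathrm{d}\psi^{k_j} \right] \delta\left( \psi - \mathcal{F}(\{\psi^{k_j}\}) \right), \tag{21}$$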

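where the function $\mathcal{F}(\{\psi^{k_j}\})$ is obtained by combining Eqs. (9)-(10) with Eq. (8):

$$\mathcal{F}_{J_i}(\{\psi^{k_j}\}) = \frac{1}{Z} \sum_{\{s_j\}} \theta\Big(K - 1 - \sum_{j=1}^{K-1} \delta_{s_j, J_j}\Big) \, \Lambda_{K - 1 - \sum_{j=1}^{K-1} \delta_{s_j, J_j}} \prod_{j=1}^{K-1} \prod_{k_j=1}^{l_j} \psi^{k_j}_{s_j}, \tag{22}$$

$$\mathcal{F}_{1 - J_i}(\{\psi^{k_j}\}) = \frac{1}{Z} \sum_{\{s_j\}} \Lambda_{K - \sum_{j=1}^{K-1} \delta_{s_j, J_j}} \prod_{j=1}^{K-1} \prod_{k_j=1}^{l_j} \psi^{k_j}_{s_j}, \tag{23}$$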

and $q(l_j)$ is the Poisson distribution with mean $\alpha K$. The probability distribution $P(\{s_j\}, s_i)$ is obtained by first
choosing $J_i$ at random and then a number t of variables $s_j$ such that $s_j \neq J_j$, following the distribution

$$P(t) = \frac{(t+1) \, p(t+1)}{\sum_{r=0}^{K} r \, p(r)} \quad \text{if } J_i \neq s_i, \tag{24}$$

$$P(t) = \frac{(K-t) \, p(t)}{\sum_{r=0}^{K} (K-r) \, p(r)} \quad \text{if } J_i = s_i, \tag{25}$$

where p(t) is given by (20). The planted initialization then corresponds to $P^{\rm init}_{s_i}(\psi) = \delta(\psi - u)$ where $u_{s_i} = 1$ and
$u_{1 - s_i} = 0$. The average Bethe entropy is expressed along the same lines.

V. PHASE TRANSITIONS IN THE REWEIGHTED ENSEMBLE, OR HOW TO USE QUIET PLANTING

Following the argumentation in [22], the quietly-planted satisfiability instances have in general three phase transitions
as the density of constraints increases. The simplest way to locate these phase transitions is to directly perform
the planting, and then investigate the reweighted BP equations (8)-(10) on the resulting instance, with two possible
initializations. In the first case we initialize the messages $\chi^{i \to a}$ and $\psi^{a \to i}$ as random normalized vectors, and in the second
case we initialize them in the planted configuration, i.e. $\chi^{i \to a}_1 = 1$, $\chi^{i \to a}_0 = 0$ if variable i was planted 1, or $\chi^{i \to a}_1 = 0$,
$\chi^{i \to a}_0 = 1$ if variable i was planted 0. Let us first summarize the general scenario found in quiet planting.

We shall say that a phase is "ferromagnetic" when the local expectation of a variable has a positive correlation
with the planted configuration. This correlation is usually measured by the overlap. So if we call $\tau$ the planted
configuration, we shall say that a probability measure is ferromagnetic if the expectation $\langle s_i \rangle$ of variable $s_i$ has a
positive overlap with the planted configuration: $\lim_{N \to \infty} (1/N) \sum_{i=1}^{N} \langle s_i \rangle \tau_i > 0$. If the overlap is zero we call it a
paramagnetic phase. As a function of the constraint density $\alpha$ we find four different phases.

• $\alpha < \alpha_d$, paramagnetic phase: BP converges towards the same fixed point when started from the two
initializations. The resulting overlap between the BP fixed point and the planted configuration is zero, and the BP
entropy is equal to the annealed entropy (18). In this phase the set of solutions forms one cluster; two random
configurations from this cluster are completely uncorrelated, the planted configuration is simply one of them,
and there is no way to tell which one.

• $\alpha_d < \alpha < \alpha_c$, paramagnetic phase with metastable ferromagnet: The behavior of BP started from the random
initialization is the same as before, whereas BP started from the planted initialization converges to a fixed
point with positive overlap with the planted configuration. The BP entropy of this fixed point is smaller than
the annealed one, and hence the paramagnet is still the dominant thermodynamic solution. In this phase there
are exponentially many clusters of roughly the same size, the planted configuration belongs to one of them, and
there is no way to tell to which one it belongs.

• $\alpha_c < \alpha < \alpha_l$, ferromagnetic phase with metastable paramagnet: The behavior of BP started from the random
initialization is the same as before. BP started from the planted initialization converges to a fixed point
with positive overlap with the planted configuration. But this time the entropy of the BP fixed point obtained
from the planted initialization is larger than the annealed entropy (18). Hence the planted cluster dominates
the measure.

• $\alpha_l < \alpha$, ferromagnetic phase: In this phase the two initializations give the same result: in both cases, BP
converges to a fixed point which is correlated with the planted configuration. Hence it is easy to find a solution
correlated with the planted configuration.

In the random K-SAT ensemble, reweighted with the parameter $\Lambda^*$, $\alpha_d$ is the threshold that corresponds to the
clustering (dynamical 1RSB) transition of the measure $\mu(\Lambda^*)$, and $\alpha_c$ is the condensation (static 1RSB) transition
of this same measure. In the $\Lambda^*$-reweighted planted ensemble the values of $\alpha_d$ and $\alpha_c$ remain unchanged, but their
physical interpretation changes.

The value $\alpha_d$ is now the limit of local stability of the planted "ferromagnetic" phase, in the sense that below this
value the ferromagnetic phase is unstable; $\alpha_c$ is the first order phase transition from paramagnet to ferromagnet. The
phase transition at $\alpha_l$ is the limit of local stability of the paramagnetic phase: beyond this point the paramagnetic
phase is unstable.

Another important property that follows from the analysis of quiet planting [23] is that the part of the phase
space that is not correlated to the planted configuration remains unchanged in the whole region of $\alpha$. Notably, for
$\alpha > \alpha_s$ (where $\alpha_s$ is the satisfiability threshold in the random SAT ensemble) only solutions correlated to the planted
configuration exist. In the random K-SAT ensemble without planting, $\alpha_l$ does not have a physical interpretation,
but at this point the BP iterations stop converging.

The transition point $\alpha_l$ is investigated by computing the local stability of the factorized (uniform) fixed point with
respect to small random perturbations. The following analytical formula then locates this phase transition:



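$$\alpha_l = \frac{1}{K(K-1) \, x^2}, \quad \text{where} \quad x = 2 \sum_{r=2}^{K} \binom{K-2}{r-2} \Lambda^*_r + 2 \sum_{r=1}^{K-2} \binom{K-2}{r} \Lambda^*_r - 1. \tag{26}$$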



In order to locate the other phase transitions, $\alpha_d$ and $\alpha_c$, we have solved the cavity equations (21)-(25) using
population dynamics [34].

In Fig. 1 we plot the full phase diagram for quietly planted 3-SAT. In this case the vector $\Lambda^*$ has only one free
parameter; we chose it to be $\Lambda_2$, and the other two components are then given by conditions (6) and (16). The
phase transition is continuous, i.e. $\alpha_d = \alpha_c = \alpha_l$, for $\Lambda_2 > 0.118(3)$. It is discontinuous for $\Lambda_2$ smaller than this
threshold. Within our numerical precision this tri-critical point in random 3-SAT agrees with the value $\Lambda_2 = 0.1180$
corresponding to the $\gamma^*$ reweighting (7). We did not, however, find a theoretical reason for this.
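
For K = 3 the one-parameter family can be written out explicitly. The sketch below assumes that the two linear conditions read $3\Lambda_1 + 3\Lambda_2 + \Lambda_3 = 1$ (normalization) and $\Lambda_1 + 2\Lambda_2 + \Lambda_3 = 1/2$ (factorized fixed point); solving them gives $\Lambda_1 = 1/4 - \Lambda_2/2$ and $\Lambda_3 = 1/4 - 3\Lambda_2/2$, so that $0 \le \Lambda_2 \le 1/6$. The function name is hypothetical.

```python
def lambda_star_3sat(lam2):
    """Return [0, Lambda_1, Lambda_2, Lambda_3] for quietly planted 3-SAT.

    Assumes the two linear conditions
        3*L1 + 3*L2 + L3 = 1      (normalization)
        L1 + 2*L2 + L3   = 1/2    (factorized BP fixed point)
    and takes Lambda_2 as the free parameter.
    """
    if lam2 < 0.0 or lam2 > 1.0 / 6.0 + 1e-12:
        raise ValueError("Lambda_2 must lie in [0, 1/6]")
    lam1 = 0.25 - 0.5 * lam2
    lam3 = 0.25 - 1.5 * lam2
    return [0.0, lam1, lam2, lam3]

xor_like = lambda_star_3sat(0.0)          # only r = 1 and r = 3 carry weight
nae_like = lambda_star_3sat(1.0 / 6.0)    # r = 3 (all variables satisfying) is excluded
```

The two end points reproduce the boundary cases of the phase diagram: at $\Lambda_2 = 0$ only clauses satisfied by an odd number of variables carry weight, and at $\Lambda_2 = 1/6$ the weight of clauses satisfied by all three variables vanishes.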



FIG. 1. The phase diagram of quietly-planted reweighted 3-SAT. The phase transitions in constraint density $\alpha$ are plotted as a
function of the parameter $\Lambda_2$. Recall that $\Lambda_1$ and $\Lambda_3$ are obtained using conditions (6) and (16). Note that the boundary
value $\Lambda_2 = 0$ in some sense reduces the problem to planted 3-XOR-SAT, and $\Lambda_2 = 1/6$ to planted NAE-SAT. Both these
cases were studied in [35, 39] and the phase transitions in these cases were known (the SAT/UNSAT transition for these problems
is shown as a red cross). The SAT/UNSAT transition does not depend on $\Lambda_2$, except for the two boundary cases. The black
dot corresponds to the phase transition for the power-law reweighted ensemble with parameter $\gamma^*$, which is indistinguishable
within our accuracy from the point where the transition goes from first order to second order. The shaded region marks
the part of the phase diagram where planting creates extremely hard satisfiable instances.



For the general case of K > 3 we restrict ourselves for simplicity to the power-law reweighting (7) with the balance condition
(17), as it was used in [23, 30]. Fig. 2 shows, in the case K = 4, the properties of the BP fixed points, starting from
the two possible initial conditions. The overlap of the resulting fixed points with the planted configuration allows
one to find the threshold $\alpha_d$. The difference of the entropies of the paramagnetic and ferromagnetic fixed points allows one to
locate the first-order equilibrium phase transition $\alpha_c$.

Table I summarizes the values of these phase transitions for different values of K. For comparison we also give the
dynamical and condensation phase transitions in the canonical K-SAT that corresponds to $\Lambda_r = 1/(2^K - 1)$ for all
$r = 1, \dots, K$.



VI. RESULTS UNVEILED BY QUIET PLANTING 
A. Non-trivial whitening and BP fixed points 



In the standard K-SAT problem, without reweighting or planting, the iteration of the BP equations starting from a
generic random initial condition typically leads to the same fixed point, which we call the "trivial" fixed point.
On the other hand, in the clustered phase, there should exist one distinct non-trivial BP fixed point for each
cluster of solutions. In fact the replica symmetry broken description of the clustered phase starts from this assumption
and then counts the number of non-trivial BP fixed points. The result is an exponentially large number [11, 39]. It
was a long-standing open question why on large instances, even when we are able to find solutions in the clustered
region, BP initialized in a solution converges back to the trivial fixed point or (in 3-SAT) does not converge at all.
The same problem existed for graph coloring, and for this problem it was resolved in [23, 24], where it was found that
the solutions found by slow simulated annealing indeed do not have a corresponding BP fixed point, whereas BP
initialized in equilibrium solutions (obtained via quiet planting) does indeed have a non-trivial fixed point.

FIG. 2. Phase diagram of the $\gamma^*$-planted 4-SAT. The overlap between the planted configuration and the BP fixed point when
BP is initialized a) randomly (blue curve), showing a phase transition at $\alpha_l$, and b) in the planted configuration (green curve),
showing a phase transition at $\alpha_d$. A careful reader will notice that the blue curve jumps up in the overlap slightly before
the constraint density $\alpha_l$; this is due to a strong finite-size correction associated with this phase transition. The red curve is
the difference between the paramagnetic entropy (18) and the entropy corresponding to the BP fixed point obtained from the
planted initialization. It becomes positive at the first-order phase transition $\alpha = \alpha_c$: for $\alpha < \alpha_c$ the paramagnetic phase is the
stable one, for $\alpha > \alpha_c$ the ferromagnetic phase is the stable one.

TABLE I. For various values of K we give the satisfiability threshold $\alpha_s$ [33], the "quiet" value of the parameter $\gamma^*$, and the
corresponding dynamical, $\alpha_d(\gamma^*)$, condensation, $\alpha_c(\gamma^*)$, and BP-convergence, $\alpha_l(\gamma^*)$, thresholds for the reweighted Boltzmann
measure $\mu(\gamma^*)$ in random K-SAT. Recall that quiet planting generates typical random graphs for densities of constraints smaller
than $\alpha_c(\gamma^*)$ (5th column). For comparison we recall the dynamical and condensation thresholds for K-SAT with no
reweighting, i.e. for the measure $\mu(\bar\Lambda)$ with uniform weights $\bar\Lambda_r = 1/(2^K - 1)$, from [13]. The 9th column gives the density
of constraints $\alpha_{{\rm BP}(\bar\Lambda)}(\gamma^*)$ beyond which
even canonical BP (at $\bar\Lambda$) converges to a non-trivial fixed point if initialized in the planted configuration. The 10th column is
taken from [30, 39]; it is the rigorous lower bound on the satisfiability threshold, $\alpha_{\rm 2nd}(\gamma^*)$, that is obtained by computation
of the reweighted second moment. The work of [30, 39] also proves rigorously that the annealed entropy density equals the
quenched one in the range $\alpha < \alpha_{\rm 2nd}(\gamma^*)$. Our physical arguments lead us to conjecture that the annealed entropy
density actually equals the quenched one in the whole range $\alpha < \alpha_c(\gamma^*)$ from the 5th column, but we do not have a rigorous proof of
this result. The 11th column is the constraint density beyond which the planted solution lies in a frozen cluster.

Whereas one could conjecture that the same reason applies to K-SAT, there was no tool to check this explicitly. The quiet planting
that we study in this paper permits such an explicit check for the first time. Indeed, the planted graphs at $\Lambda^*$ are
equivalent to non-planted random graphs for $\alpha < \alpha_c(\Lambda^*)$. This gives us a tool to generate both a typical instance
of K-SAT and a typical solution, in the regime $\alpha < \alpha_c(\Lambda^*)$. In this regime, if we start from the typical solution
that we have obtained through planting, and then iterate the canonical BP equations (i.e. BP at the uniform reweighting
$\bar\Lambda$, with $\bar\Lambda_r = 1/(2^K - 1)$), we observe
that this canonical BP converges to a non-trivial fixed point whenever $\alpha > \alpha_{{\rm BP}(\bar\Lambda)}(\gamma^*)$. From Table I we see that
$\alpha_{{\rm BP}(\bar\Lambda)}(\gamma^*) < \alpha_c(\gamma^*)$ for K > 5. This shows that we have a non-empty regime where we can find non-trivial BP fixed
points. Note, however, that in this case (unlike for the coloring) the planted configuration is not an equilibrium one
for the Boltzmann measure $\mu(\bar\Lambda)$, hence the corresponding non-trivial BP fixed point does not describe the equilibrium
cluster relative to the original, flat, measure. Note also that the fact that $\alpha_{{\rm BP}(\bar\Lambda)}(\gamma^*) < \alpha_d(\bar\Lambda)$ for K > 3 is
direct evidence for the presence of smaller Gibbs states in the paramagnetic region where a large single Gibbs state
still dominates the $\mu(\bar\Lambda)$ measure.

Freezing of variables was argued to be an important ingredient in understanding the relation between the structure
of solutions and algorithmic hardness [40]. A variable is frozen in a cluster if it takes the same value in all
solutions belonging to that cluster. Within the assumptions of the cavity method one can investigate whether a solution
belongs to a frozen cluster by running BP initialized in the solution and monitoring whether the beliefs stay
completely polarized. In fact a simpler version of BP can be written for such monitoring; it is called warning
propagation or whitening in the literature [41]-[4J]. There is a long-standing open question in the field concerning
this whitening [40]. Whereas, for instance, the survey-propagation algorithm is based on the existence of solutions
with a non-trivial whitening result, when one runs the whitening algorithm on solutions found by heuristic
algorithms, the whitening result on large instances was always observed to be trivial. In [2^ it was shown for the
coloring problem that equilibrium solutions do indeed have a non-trivial whitening, as expected from the theoretical
calculations. This work has also shown that theoretical reasons exist for the solutions found by local algorithms not to
have a corresponding non-trivial whitening. Now we show that random K-SAT formulas also have solutions with
a non-trivial whitening.
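
The whitening procedure can be sketched as follows. This is a minimal illustration, assuming the common formulation in which a variable is turned into a joker ("white") as soon as no clause relies on it as its only satisfying, non-white variable, and a clause containing a white variable counts as satisfied. The function name `whitening_core` and the DIMACS-style clause encoding (signed integers) are choices made for this example only.

```python
def whitening_core(clauses, solution):
    """Return the set of variables that survive whitening (the whitening core).

    clauses  : list of tuples of non-zero ints, e.g. (1, -2, 3) means x1 OR NOT x2 OR x3
    solution : dict mapping variable index -> bool, assumed to satisfy every clause
    """
    for clause in clauses:
        if not any(solution[abs(lit)] == (lit > 0) for lit in clause):
            raise ValueError("the assignment does not satisfy the formula")

    # For each variable, the indices of the clauses in which it appears.
    occurrences = {}
    for idx, clause in enumerate(clauses):
        for lit in clause:
            occurrences.setdefault(abs(lit), []).append(idx)

    white = set()

    def pins(clause, var):
        # A clause pins `var` if it contains no white variable and
        # `var` is the only variable whose literal satisfies it.
        if any(abs(lit) in white for lit in clause):
            return False
        satisfying = [abs(lit) for lit in clause if solution[abs(lit)] == (lit > 0)]
        return satisfying == [var]

    changed = True
    while changed:  # iterate until no further variable can be whitened
        changed = False
        for var in solution:
            if var in white:
                continue
            if not any(pins(clauses[c], var) for c in occurrences.get(var, [])):
                white.add(var)
                changed = True
    return set(solution) - white
```

An empty returned set corresponds to a trivial whitening; a non-empty set is a non-trivial whitening core. For instance, on the small formula `[(1, -2), (2, -3), (3, -1)]` with all variables true, every clause is satisfied by exactly one literal and no variable can be whitened.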

Importantly, warning propagation has no dependence on $\Lambda$; hence if a cluster is frozen at one value of $\Lambda$ it must be
frozen for all $\Lambda$. Table I gives the values of $\alpha$ beyond which the planted solution lies in a frozen cluster. We apply
the quiet planting procedure, which generates a typical instance in which the planted solution is also typical, as long
as $\alpha < \alpha_c(\gamma^*)$. We see that, for $K \geq 6$, the planted solution lies in a frozen cluster on typical random instances of
K-SAT (in the range $\alpha_f(\gamma^*) < \alpha < \alpha_c(\gamma^*)$).
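
On a toy formula the notion of a frozen variable can be checked by brute force. The sketch below enumerates all solutions, groups them into connected components under single-variable flips, and returns the variables that take a single value across a component. Using these geometric components as a stand-in for clusters is an assumption of the example (the text defines freezing with respect to clusters in the cavity-method sense), and the function names are hypothetical. The enumeration is exponential in the number of variables and is meant only as an illustration.

```python
from itertools import product


def solutions(clauses, n):
    """All satisfying assignments of a CNF formula over variables 1..n, as tuples of bools."""
    found = []
    for bits in product((False, True), repeat=n):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses):
            found.append(bits)
    return found


def geometric_clusters(sols):
    """Connected components of the graph whose edges join solutions differing in one variable."""
    sol_set = set(sols)
    seen, clusters = set(), []
    for start in sols:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first search over single-flip neighbours
            current = stack.pop()
            component.append(current)
            for i in range(len(current)):
                neighbour = current[:i] + (not current[i],) + current[i + 1:]
                if neighbour in sol_set and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(component)
    return clusters


def frozen_variables(cluster):
    """Variables (1-based) taking the same value in every solution of the cluster."""
    n = len(cluster[0])
    return {i + 1 for i in range(n) if len({sol[i] for sol in cluster}) == 1}
```

For the formula `[(1, -2), (2, -3), (3, -1)]` on four variables, the first three variables are forced to be equal and the fourth is free, so each component contains two solutions in which variables 1 to 3 are frozen.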



[Figure 3 plot; horizontal axis: BP iterations.]

FIG. 3. Non-trivial whitening in the random 6-SAT problem: here N = 10^, the graphs were planted, but are equivalent to
random ones thanks to the quiet planting property. When $\alpha > 38$ a non-trivial whitening core is obtained for the planted
solution. No solution to a large instance of the random K-SAT problem with a non-trivial whitening core had been found before.



B. Direct evidence for purely entropic barriers 



Researchers often define clusters not as BP fixed points, nor as Monte Carlo ergodic components reachable in
polynomial time, but as connected components of an auxiliary graph whose vertices are satisfying configurations and
whose edges join vertices that differ in the value of only one variable. It has been pointed out on many occasions
that there might be important differences between these definitions. In particular, it was conjectured that within the
dynamical definition clusters could be connected via exponentially narrow entropic bridges. Until now there was no
direct evidence for the validity of this conjecture. Here we give one. Consider a planted instance in the regime where
$\alpha_c(\Lambda^*) > \alpha > \alpha_d(\Lambda^*)$ and simultaneously $\alpha < \alpha_{\mathrm{BP}(\mathbb{1})}(\gamma^*)$ (for values see Table I). The set of satisfying
configurations ("solutions") is independent of $\Lambda$. We now perform two types of Monte Carlo random walks in the space of
solutions, both initialized in the planted configuration. In the first case we use a Monte Carlo random
walk that satisfies the detailed balance condition with respect to the measure $\mu(\Lambda^*)$. In the second case, we consider
a Monte Carlo that satisfies detailed balance with respect to the $\mu(\mathbb{1})$ Boltzmann measure. In the first case the
random walk remains confined near the planted configuration for a time that diverges in
the thermodynamic limit (probably faster than any power law). In the second case the Monte Carlo is able to
diffuse far away from the planted configuration. This means that it manages to find some paths that were very narrow
or rare in the measure $\mu(\Lambda^*)$ but are now easy to find in $\mu(\mathbb{1})$. Using these paths, the second procedure escapes to
a large distance from the planted configuration. A completely analogous dependence can be seen in the iterations of
the BP equations at $\Lambda^*$ and at $\mathbb{1}$. These experiments are illustrated in Fig. 4. They show the existence of entropic
barriers.

FIG. 4. Illustration of the existence of purely entropic barriers: a solution is planted using $\gamma^*$, $\alpha = 8.2$, N = 10^, $K = 4$. Then a
Monte Carlo simulation satisfying detailed balance for the measure $\mu(\Lambda^*)$ is performed starting from this planted configuration:
the system is trapped in a Gibbs state and does not exit in sub-exponential time. Switching then to a new Monte Carlo, which
satisfies detailed balance for the measure $\mu(\mathbb{1})$, i.e. the original SAT Hamiltonian, makes it very easy for the dynamics to exit
this state and to sample the larger paramagnetic state. This demonstrates the presence of entropic barriers and the fact that
the dynamics can be trapped even in the absence of frozen variables or energetic barriers.
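
The two random walks compared in Fig. 4 can be sketched with a single-flip Metropolis rule. The sketch assumes that the reweighted measure gives a solution the weight $\prod_a \lambda_{r_a}$, where $r_a$ is the number of satisfied literals in clause $a$ and $\lambda_0 = 0$, so that a proposed flip is accepted with probability min(1, ratio of weights). Passing all-ones weights gives the walk for the flat measure. The function names and the weight list `lam` are hypothetical, and the dynamics actually used for the figure may differ in its details.

```python
import random


def satisfied_count(clause, sigma):
    """Number of literals of `clause` satisfied by the assignment `sigma`."""
    return sum(1 for lit in clause if sigma[abs(lit)] == (lit > 0))


def metropolis_walk(clauses, sigma0, lam, steps, rng=None):
    """Single-flip Metropolis walk among solutions, started from the solution `sigma0`.

    lam[r] is the (assumed) weight of a clause with r satisfied literals, with lam[0] == 0.
    Returns the final assignment and the Hamming distance from sigma0 after each step.
    """
    if rng is None:
        rng = random.Random(0)
    if any(satisfied_count(c, sigma0) == 0 for c in clauses):
        raise ValueError("the walk must start from a solution")

    occurrences = {}
    for idx, clause in enumerate(clauses):
        for lit in clause:
            occurrences.setdefault(abs(lit), []).append(idx)

    sigma = dict(sigma0)
    variables = sorted(sigma)
    distances = []
    for _ in range(steps):
        v = rng.choice(variables)
        touched = occurrences.get(v, [])
        before = [satisfied_count(clauses[c], sigma) for c in touched]
        sigma[v] = not sigma[v]  # tentative flip
        after = [satisfied_count(clauses[c], sigma) for c in touched]
        ratio = 1.0
        for r_old, r_new in zip(before, after):
            ratio *= lam[r_new] / lam[r_old]  # equals 0 if a clause becomes violated
        if rng.random() >= ratio:  # reject the move and undo the flip
            sigma[v] = not sigma[v]
        distances.append(sum(sigma[i] != sigma0[i] for i in variables))
    return sigma, distances
```

Because a move that violates a clause has weight ratio zero, the walk never leaves the space of solutions. Running it once with the planting weights and once with `lam = [0, 1, ..., 1]` on the same instance, and comparing the two distance traces, mimics the experiment described in the caption.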



C. How hard is $\Lambda^*$-planted K-SAT



Hardness of the $\gamma^*$-planted random 3-SAT instances was investigated in [28] for DPLL, WalkSAT and the Survey
Propagation algorithm. The authors found empirically a region where all three algorithms fail or scale exponentially
with the size of the system. Our results actually show that for simulated annealing or for belief propagation
decimation random $\gamma^*$-planted 3-SAT is easy. Indeed, for $\alpha > \alpha_l(\gamma^*)$ the BP and MCMC dynamics are attracted close to the
planted configuration and hence finding a solution nearby is easy. For $\alpha < \alpha_l(\gamma^*)$ the planted formula is equivalent
to a random formula, and it is so sparse that MCMC will equilibrate in linear time and hence simulated annealing
will find a solution in linear time; in this region BP decimation also works [4^, \^. On the other hand, for $K > 4$
the $\gamma^*$-planted K-SAT has a hard region for values of $\alpha_s < \alpha < \alpha_l(\gamma^*)$. Indeed, for $\alpha < \alpha_l(\gamma^*)$ the planted cluster is
hidden to the dynamics and for $\alpha > \alpha_s$ there are no solutions other than those belonging to the planted cluster.

The algorithmic hardness of $\Lambda^*$-planted 3-SAT formulas was studied in [23] using a WalkSAT algorithm. We revisit
the corresponding phase diagram in view of our results. We show that the conclusions of [27] were qualitatively
correct, but the correct boundary of the "hard" region is different from what was estimated in [27] (as the statistical
physics calculations in that work were only approximate). Most importantly, we give further theoretical justification
for why the $\Lambda^*$-planted random 3-SAT formulas are (together with the benchmarks from [29]) the hardest satisfiable
formulas on the market of hard satisfiable benchmarks.

We recall that the space of solutions in the $\Lambda^*$-planted formulas can be split into two parts. The first part includes
all the solutions (satisfying assignments) that would exist anyway in the non-planted random 3-SAT. The second part
includes solutions correlated to the planted configuration. Algorithmic search for a solution belonging to the first part
is just as hard (or as easy) as it is in the canonical random K-SAT. Notably, for $\alpha > \alpha_s$ this space is empty (canonical
random SAT is unsatisfiable). It follows from our results above that for $\alpha > \alpha_l$ it is easy to find a solution correlated
to the planted configuration; to do so one can use $\Lambda^*$-reweighted BP or $\Lambda^*$-reweighted MCMC. We argue that for
$\alpha < \alpha_l$ finding solutions correlated to the planted one is hard. Finding a configuration (not necessarily a solution)
correlated to the $\Lambda^*$-planted configuration may be seen as an inference problem, and running $\Lambda^*$-reweighted MCMC is
then a Bayes-optimal algorithm for this inference problem. The running time, however, becomes exponentially large
below the transition $\alpha_l$ (and above $\alpha_d$). Clearly, if it is exponentially hard to find any configuration correlated to the
planted configuration, it will be even harder to find a solution correlated to the planted configuration. To conclude,
the boundary of the hard region is given by $\alpha < \alpha_l$, so that the planted configuration cannot be found, and at the same
time $\alpha$ sufficiently large for the canonical random K-SAT to be hard, or better not to have any solutions at all, i.e.
$\alpha > \alpha_s$.

Let us also describe explicitly the relation between the benchmarks described here and those of [29]. Instances
generated by the $\Lambda^*$-planting with $\lambda_2$ close to zero can be seen as planted XOR-SAT formulas with "a bit of
nonlinearity". If moreover we restrict ourselves to regular formulas, where every variable appears in the same
number of clauses, then we are exactly in the setting of [29], which introduced planted regular XOR-SAT formulas with
"a bit of nonlinearity" as the hardest known satisfiable benchmarks. Moreover, the kind of nonlinearity that quiet
planting adds by taking a nonzero value of $\lambda_2$ is not easy to discover, even if the exact protocol by which the instances
were created is known. Note that the regular instances are taken in order to lower the total number of variables (e.g.
the leaves of random instances do not contribute to the overall hardness and can hence be omitted). The analysis
presented in this paper can be applied to regular instances straightforwardly.
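
The generation of such planted instances can be sketched as follows. This is an illustration under stated assumptions rather than the exact protocol: it assumes that, for every clause, the number r of literals satisfied by the planted configuration is drawn with probability proportional to $\binom{K}{r}\lambda_r$ (with $\lambda_0 = 0$), which is what one obtains when each clause is weighted by $\lambda_r$ and all sign patterns are otherwise equally likely. The names `plant_formula` and `lam` are hypothetical, the weights below are placeholders and not the quiet values, and the sketch produces random rather than regular formulas.

```python
import random
from math import comb


def plant_formula(n, m, k, lam, rng=None):
    """Draw a planted assignment and m clauses of size k over variables 1..n.

    lam[r] is the (assumed) weight of a clause in which the planted assignment
    satisfies exactly r literals; lam[0] must be 0 so that every clause is satisfied.
    Returns (clauses, planted), with clauses as tuples of signed integers.
    """
    if rng is None:
        rng = random.Random(0)
    planted = {i: rng.random() < 0.5 for i in range(1, n + 1)}
    # Probability of r satisfied literals is proportional to binom(k, r) * lam[r].
    weights = [comb(k, r) * lam[r] for r in range(k + 1)]
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        r = rng.choices(range(k + 1), weights=weights)[0]
        satisfied = set(rng.sample(variables, r))
        # A positive literal on v is true iff planted[v] is True, so the sign is
        # chosen to make the literal true exactly for the variables in `satisfied`.
        clause = tuple(v if planted[v] == (v in satisfied) else -v for v in variables)
        clauses.append(clause)
    return clauses, planted
```

With K = 3 and the weight of two satisfied literals set to zero, e.g. `plant_formula(100, 300, 3, [0, 1, 0, 1])`, every clause has an odd number of literals satisfied by the planted configuration, which is the XOR-SAT limit mentioned above; a small positive value in that slot adds the "bit of nonlinearity".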

VII. CONCLUSIONS 

In this paper, we have studied the reweighted measure over solutions of the random K-satisfiability problem, where
each solution is reweighted according to the number of variables that satisfy every clause. This problem has been
addressed both analytically, using the cavity method, and numerically, using belief propagation and Monte Carlo
sampling.

The main results of this study are the following: (i) The reweighting allows one to introduce a planted ensemble that
generates instances that are equivalent to random instances in some region of the phase diagram. In this case, we
are thus able to generate simultaneously a typical SAT instance and one of its solutions. (ii) We have used this
property to give direct evidence for the existence of purely entropic (rather than energetic) barriers between clusters
in some region of the phase diagram. This is a fundamental point demonstrating that the physical definition of
clusters as Gibbs states (or belief propagation fixed points) is not equivalent to the geometric definition of clusters
as disconnected components in the solution space. Such equivalence was often wrongly assumed in the literature and
leads to confusion. (iii) We have used the quiet planting property to display explicitly, for the first time, solutions
of large random K-SAT problems leading to a non-trivial whitening core: while such solutions were known to exist,
they had so far never been observed on large instances. (iv) We discuss the algorithmic hardness of these planted instances
and determine a region of parameters in which quiet planting leads to hard satisfiable benchmarks.